Abstract

Even though transaction deadlocks can severely affect the perfor- 
mance of distributed database systems, many current evaluation tech- 
niques ignore this aspect. In [4], Shyu and Li proposed an evaluation 
method which takes deadlocks into consideration. However, their tech- 
nique is limited to exclusive locking. In this paper, we extend their 
technique to allow for both shared and exclusive locking. Using this 
technique, we illustrate the impact of deadlocks, in the presence of 
shared locking, on distributed database performance. 


Index Terms: Distributed databases, exclusive locking, performance mod- 
eling, shared locking, static locking, two-phase locking. 
1 Introduction


A distributed database system (DDS) is a collection of cooperating nodes 
each containing a set of data objects. A user transaction can enter such a 
system at any of these nodes. The receiving node, often referred to as the 
coordinating node, undertakes the task of locating the nodes that contain 
the data objects required by a transaction. 

In order to maintain database consistency and correctness in the pres- 
ence of concurrent transactions, several concurrency control protocols have 
been proposed [1]. Of these, locking protocols have been widely used in both
commercial and research environments. In static locking, before it starts
executing, a transaction must acquire either a shared lock (for read
operations) or an exclusive lock (for update operations) on each of the relevant
data objects. When transactions with conflicting lock requests are initiated
concurrently, they may block one another and become deadlocked. Deadlocks
are known to deteriorate performance in both centralized and distributed 
database systems [4,6]. In spite of this, several performance studies have 
ignored the deadlock problem in their analyses [2,5]. 

In [4], Shyu and Li proposed an elegant technique to evaluate the re- 
sponse time and throughput of transactions in a non-replicated DDS. (In the 
rest of the paper, we refer to this as the S-L technique.) Assuming exclusive 
locking (i.e., only write operations), they model the queue of lock requests 
at an object as an M/M/1 queue [3]. This results in a closed-form expression for the
waiting time distribution at a node, expressed in terms of the average rate
of arrival of requests and the average lock-holding time. This technique
consists of two stages. In the first stage, the average transaction response 
time and throughput are calculated by ignoring the deadlock. This is an 
iterative step that uses the known properties of the M/M/1 queue [3]. In 
the second stage, the probabilities of transaction conflicts and deadlocks are 
computed. These probabilities are used, in turn, to compute the response 
time and throughput in the presence of deadlocks. 

In general, a database transaction reads from a set of data objects (the 
read-set) and writes on to a set of data objects (the write-set). Assuming 
that all accesses are write-only (as in S-L) results in the worst-case per- 
formance (with respect to deadlocks and response time) of a DDS. In this 
paper, we extend the S-L technique to consider both the read
and the write operations of database transactions. Using the extended S-L technique,
we evaluate the effect of deadlocks on distributed database systems. 
2 Model


Except for the inclusion of read operations, our model is the same as in S-L. 

For the sake of completeness, we summarize the DDS model here. 

• There are N nodes and D data objects (or data granules in S-L) in 
a DDS. The D data objects are uniformly distributed across the N 
nodes. A data object may be located at exactly one node. 

• Each transaction accesses K data objects. Among these, r · K are
for read-only purposes, and the rest are for read-write. (Obviously,
0 ≤ r ≤ 1.) In other words, a transaction must procure r · K shared
locks and (1 − r) · K exclusive locks.

• Each data object is equally likely to be accessed by a transaction. 

• Transaction arrivals into the system form a Poisson process with rate λ.

• The communication delay between nodes is exponentially distributed 
with mean i. 

• The average execution time of a transaction, once the locks are ob- 
tained, is S. 
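
As an illustration, the model parameters above can be collected in a small Python structure. This is only a sketch: the class name, the field names, and the numeric values in the usage note are hypothetical and do not come from S-L. The derived per-object request rate, λ · K/D, follows from the assumption that each of the D objects is equally likely to be accessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelParams:
    """Hypothetical container for the DDS model parameters."""
    n_nodes: int         # N, number of nodes
    n_objects: int       # D, number of data objects
    locks_per_txn: int   # K, data objects accessed per transaction
    read_ratio: float    # r, fraction of accesses that are read-only
    arrival_rate: float  # lambda, transaction arrival rate

    def shared_locks(self) -> float:
        # r * K shared locks per transaction
        return self.read_ratio * self.locks_per_txn

    def exclusive_locks(self) -> float:
        # (1 - r) * K exclusive locks per transaction
        return (1.0 - self.read_ratio) * self.locks_per_txn

    def per_object_request_rate(self) -> float:
        # Each transaction requests K of D equally likely objects,
        # so one object sees lock requests at rate lambda * K / D.
        return self.arrival_rate * self.locks_per_txn / self.n_objects
```

For example, with the made-up values N = 4, D = 1000, K = 6, r = 2/3 and λ = 50, a transaction needs 4 shared and 2 exclusive locks on average, and each object receives lock requests at rate 0.3.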


3 Evaluation Procedure 

Since we are only proposing extensions to the S-L model, we do not intend to 
repeat the description of their procedure. Instead, we will discuss only the 
salient features of their procedure that are relevant to describe the proposed 
extensions. 

In Stage 1 of the S-L technique, an iterative procedure is used to evaluate
the response time and throughput of a DDS ignoring the possibility
of deadlocks. In each iteration, the average waiting time (for exclusive lock
requests) at each of the data objects is computed using estimates of the
average lock-holding times from the previous iteration. By definition, no two
exclusive lock requests can have lock grants on the same object simultaneously.
Also, assuming that the lock-holding time is exponentially distributed
(with mean 1/μ) and that the lock request arrivals form a Poisson process
(with rate λ_r = λ · K/D), the distribution of waiting time at an object
i is expressed as (M/M/1 queueing formula [3])

$$f_{W_i}(y) = (1-\rho)\,u_0(y) + \lambda_r (1-\rho)\, e^{-\mu(1-\rho)y} \qquad (1)$$

where u_0(·) is the impulse function and ρ = λ_r/μ. Using the waiting time
distribution, the waiting times at the K data objects are randomly generated.
These are used, in turn, to derive new estimates for the lock-holding
times (1/μ). The iterations stop when two successive computations of average
waiting time estimates are very close.
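
The random generation of waiting times from the M/M/1 distribution can be sketched as follows. This is a minimal illustration, not the S-L implementation: the function names are ours, and we assume the standard M/M/1 result that a request waits zero time with probability 1 − ρ and otherwise waits an exponentially distributed time with rate μ(1 − ρ).

```python
import random


def sample_mm1_wait(lam_r, mu, rng):
    """Draw one queueing delay from the M/M/1 waiting time distribution.

    lam_r: lock request arrival rate at the object
    mu:    reciprocal of the mean lock-holding time
    rng:   a random.Random instance
    """
    rho = lam_r / mu
    if not 0.0 <= rho < 1.0:
        raise ValueError("the queue is stable only for 0 <= rho < 1")
    # Impulse at zero: the request finds the object free.
    if rng.random() < 1.0 - rho:
        return 0.0
    # Otherwise the delay is exponential with rate mu * (1 - rho).
    return rng.expovariate(mu * (1.0 - rho))


def mean_mm1_wait(lam_r, mu):
    """Mean of the same distribution, rho / (mu * (1 - rho))."""
    rho = lam_r / mu
    return rho / (mu * (1.0 - rho))
```

Drawing K such samples, one per object requested by a transaction, gives the set of waiting times used within one iteration.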

When we consider both shared and exclusive locks, the problem of es- 
timating the waiting time distributions becomes difficult. Since two shared 
lock grants on the same object may exist simultaneously, and an exclusive 
lock may not be granted while another shared or exclusive lock is already 
granted, the queueing discipline at a node is complex. Such complex queue- 
ing disciplines are analytically intractable [3]. For this reason, we propose to 
use simulation to solve the queueing model. Given the total rate of arrival
of lock requests (λ_r), the shared lock ratio (r), and the average lock-holding
time (1/μ), the queue at an object may be simulated. From this simulation, the waiting
time distribution may be obtained in the form of a table. Once the waiting
time distribution is obtained, the same iterative procedure as in Stage 1
of S-L may be adopted to compute the response time when deadlocks are
ignored. As in S-L, transaction response time is defined as the time between
the instant the lock requests are sent and the instant the last lock grant
is received by the coordinating node.
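
A simulation of the lock queue at a single object might look like the sketch below. Several details are our assumptions rather than part of the method described above: requests are served in strict first-come-first-served order (a shared request never overtakes an earlier waiting exclusive request), lock-holding times are exponential with mean 1/μ for both lock types, and the function names and the quantile-table format are hypothetical.

```python
import math
import random


def simulate_lock_queue(lam_r, mu, r, n_requests, seed=0):
    """Simulate lock waits at one object; returns a list of waiting times.

    lam_r: Poisson arrival rate of lock requests at the object
    mu:    reciprocal of the mean lock-holding time
    r:     probability that a request is for a shared lock
    """
    rng = random.Random(seed)
    t = 0.0                 # arrival time of the current request
    max_release_all = 0.0   # latest release time over all granted locks
    max_release_excl = 0.0  # release time of the latest exclusive lock
    waits = []
    for _ in range(n_requests):
        t += rng.expovariate(lam_r)
        is_shared = rng.random() < r
        hold = rng.expovariate(mu)
        if is_shared:
            # A shared lock only has to wait for earlier exclusive locks.
            grant = max(t, max_release_excl)
        else:
            # An exclusive lock waits until every earlier lock is released.
            grant = max(t, max_release_all)
        release = grant + hold
        max_release_all = max(max_release_all, release)
        if not is_shared:
            max_release_excl = release
        waits.append(grant - t)
    return waits


def waiting_time_table(waits, n_points=10):
    """Tabulate the empirical distribution as (quantile, waiting time) pairs."""
    ordered = sorted(waits)
    n = len(ordered)
    table = []
    for i in range(1, n_points + 1):
        q = i / n_points
        idx = min(n - 1, max(0, math.ceil(q * n) - 1))
        table.append((q, ordered[idx]))
    return table
```

With r = 0 the simulated queue reduces to the M/M/1 case, which gives a convenient check of the simulation against the closed-form result; with r = 1 no request ever waits.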

In Stage 2, the probabilities of transaction deadlock and restart are com- 
puted. These are then used to compute response time and throughput in 
the presence of deadlocks. When we assume that transactions only make 
exclusive lock requests, the expression for the probability of conflict between 
any two transactions is given by, 


$$P_c = 1 - \ldots \qquad (2)$$


However, when we consider both shared locks and exclusive locks, the
probability of conflict is reduced. In this case the probability of conflict is given by,
where K' = r · K represents the average number of shared locks; (K − K')
is the average number of exclusive locks per transaction. Clearly, when
r = 0, P_c = P'_c; when r = 1, P'_c = 0; and in all cases, P_c ≥ P'_c.

By replacing P_c with P'_c, the procedure suggested in S-L may be applied
to obtain the desired performance metrics.
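
The limiting behaviour of the conflict probability can be checked with a small combinatorial sketch. The expression used here is our own derivation and is not necessarily the one used in S-L or in this paper: we assume that each transaction picks its K objects uniformly at random without replacement from the D objects, that K' of them (an integer) are locked in shared mode, and that two transactions conflict when they have an object in common on which at least one of them holds an exclusive lock.

```python
from math import comb


def _comb(n, k):
    # Binomial coefficient that is zero when n is negative.
    return comb(n, k) if n >= 0 else 0


def conflict_probability(D, K, K_shared):
    """Probability that two transactions conflict (illustrative assumption).

    D:        number of data objects
    K:        objects accessed per transaction
    K_shared: number of those accessed under a shared lock (K')
    """
    if not 0 <= K_shared <= K <= D:
        raise ValueError("need 0 <= K_shared <= K <= D")
    E = K - K_shared  # exclusive locks per transaction
    # Ways for the second transaction to choose its exclusive set,
    # then its shared set from the remaining objects.
    total = comb(D, E) * comb(D - E, K_shared)
    # No conflict: its exclusive set avoids all K objects of the first
    # transaction, and its shared set avoids the first one's exclusive set.
    no_conflict = _comb(D - K, E) * _comb(D - 2 * E, K_shared)
    return 1.0 - no_conflict / total
```

Under these assumptions the function reproduces the three properties stated above: it equals the all-exclusive value when K' = 0, it is zero when K' = K, and it does not increase as K' grows.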
4 Results


Using the extended S-L technique, we obtained a number of interesting 
results that illustrate the effect of deadlocks on database performance. These 
are summarized in Figures 1-5. We have verified our results against those
obtained in [4] for the all-exclusive-locks case (r = 0). We make the following
observations. 

• As expected, the presence of shared locks has a substantial impact on 
the probability of deadlock occurrence (Fig. 1). When only 1/3 of 
the accessed data objects are updated (i.e., r = 2/3), the probability 
of deadlock is considerably smaller than when all objects are
updated (r = 0). 

• The observations about the deadlock probabilities are also valid for 
restart probabilities (Fig. 2). 

• Transaction response times are also quite sensitive to the ratio of 
shared locks (Fig. 3). Here, we compare the response times when 
deadlocks are ignored (computed in Stage 1) with those obtained when 
deadlocks are considered (computed in Stage 2). The effect of deadlocks
is more pronounced at higher transaction loads and with smaller
values of r. When r = 2/3, the effect of deadlocks on response time is
not significant.

• The effect of deadlocks on response time decreases as the number of
data objects increases (Fig. 4). This is due to the
decrease in the probability of conflicts and hence a decrease in deadlock
occurrence. For r = 2/3, this effect is almost insignificant. For r = 1/3
and r = 0, deadlocks seem to have a noticeable effect on response
time.

• Fig. 5 summarizes the effect of the number of locks per transaction on 
response time. When K is small, the probability of deadlock is negli- 
gible, and hence its effect on response time is small. At higher values 
of K, the effect of deadlocks on response times is significant. Similarly,
at smaller values of r, the effect of deadlocks is more apparent.
5 Conclusion


In [4], Shyu and Li presented an elegant technique to evaluate the perfor- 
mance of distributed database systems in the presence of deadlocks. Their 
technique assumed only exclusive locks and thus represents the worst-case
effects of deadlocks. In this paper, we have extended their technique to
allow both shared and exclusive locking. Using the extended technique, we
evaluated the effect of the number of data objects, the number of data
objects accessed, and the ratio of read operations on transaction response time.
These results also indicate the importance of considering both shared and 
exclusive lock requests for response time evaluations. 
Fig. 2. Restart probability with different read ratios.

DC: Deadlock considered.
DI: Deadlock ignored.

Fig. 4. Response time with a high number of data objects.








